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Abstract 

This paper deals with the identification of the multivariate fractional Brownian motion, a recently 
developed extension of the fractional Brownian motion to the multivariate case. This process is a p- 
multivariate self-similar Gaussian process parameterized by p different Hurst exponents Hi, p scaling 
coefficients ai (of each component) and also by p{p — 1) coefficients Pij,rjij (for i,j = 1, . . . ,p with 
j > i) allowing two components to be more or less strongly correlated and allowing the process to 
be time reversible or not. We investigate the use of discrete filtering techniques to estimate jointly 
or separately the different parameters and prove the efficiency of the methodology with a simulation 
study and the derivation of asymptotic results. 

Keywords : Self similarity ; Multivariate process ; Long-range dependence ; Discrete variations ; 
Parametric estimation. 

1 Introduction, main results 

The last decade has seen a dramatic effort of research to understand real networks, or complex networks, 
of any kind [32[ I36[ l38j. Indeed, many systems whether natural or man-made constitute networks of 
interacting systems. These networks are usually considered as complex systems, in the sense that a 
global behavior emerges from the interaction and cannot be predicted from the sole observation of the 
individuals. In general, the complexity of the system gives to measurements taken at individuals difficult 
properties such has nonstationarity, fractality, long-range dependence, . . . This for example occurs in 
functional magnetic resonance imaging (fMRI), where data collected from different parts of the brain 
are of course correlated between each other, but also present long-range dependence [3l[2]. In internet 
tomography, it is now well recognised that time series corresponding to IP packets or bytes are correlated 
and long-range dependent [T]. But mutlivariate time series depicting long-range dependence have also 
been encountered in fields as different as physics or economics [211 [5] . 

When measurements are collected simultaneously at several nodes of the networks, the global data 
set has to be modeled as a multivariate time series. Conversely, given a multivariate signal, a goal may 
be to solve an inverse problem: identification of the network underlying the multivariate measurement 
(each component is associated to a node of the network; a link between two nodes assesses for dependence 
between the components.) This problem is a problem of graphical modeling |43l I26|. To model long- 
range multivariate processes, we studied in jl2| the extension to the multivariate case of the fractional 
Brownian motion (and its increments). The mfBm is a Gaussian multivariate signal, whose components 
are correlated scalar fBm with a priori different Hurst exponents. This model is interesting for modeling 
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fMRI data. In this paper, we work for the converse problem, developing a methodology to identify the 
mfBm. 

The multivariate fractional Brownian motion is characterized by the Hurst exponents of its compo- 
nents, by its covariance matrix at time 1, and also by an antisymmetric matrix 77^- which controls the time 
assymmetry of the multivariate process. We provide here a framework to estimate all these characteris- 
tics from the observation of one sample path of the mfBm. This multivariate process is a nonstationary 
process with stationary increments. Thus in order to perform time average we work on the increments 
directly. However, as we recall in section [21 the components of the increments process may be long-range 
dependent individually, and may also present what we call long-range interdependence, meaning that 
their cross-correlation function may be not summable. This leads to considerable difficulties in the in- 
ference methods, especially implying very poor convergence rates (see |10| for example). To circumvent 
the problem in the scalar case, it is well-known that derivatives smoother than increments have to be 
considered. The most popular smooth derivative is provided by the wavelet transform when the wavelet 
is chosen to be orthogonal to polynomial of sufficient high degrees (see [TSl [111110] for early references). 
Here, we use a slightly different approach using discrete, compactly supported filters, that need to be 
orthogonal to some polynomial, but are not necessarily linked to wavelet theory (in that they do not 
necessarily are the base for a multiresolution analysis). 

The filtering is performed for dilated version of the filter with factor m. Each component of the 
multivariate signal is so filtered. We show that the cross-covariance between the components of the 
filtered version is a power law of m. This generalizes the well-known power law behavior as a function 
of scale of the variance of the wavelet coefficients in the scalar fractional Brownian motion case. Thus 
we perform a linear regression in log variables to estimate the exponents (linked to Hurst index) and the 
other parameters. 

However, since we calculate cross-covariance as well as covariances, we have an overdetermined set of 
equations to estimate the Hurst parameters. We experimentally show that it is preferable to eliminate this 
overdetermination for the estimation of the Hurst exponents. Therefore, the first conclusion of the study 
is that for the estimation of the Hurst exponents of the components, it is not advantageous to consider 
the whole multivariate process, but better to process each component separately. The second conclusion 
is the fact that the quality of these estimations is almost independent of the correlation between the 
components. Finally we illustrate the fact that the estimation of the correlation structure is easy whereas 
it is very difficult to estimate the asymmetry parameters. Our finding are based on experiments as well 
as theoretical proof of convergence of the estimators we exhibit. We show there almost sure convergence 
and provide a central limit theorem proving usual yjn convergence rate if the filters are properly chosen. 

The paper is structured as follows. We present in the following section the essential facts on the 
mfBm needed for the paper to be self consistent. We also present the filters that we are using and the 
statistical properties of the filtered mfBm. Section [3] then presents the methodology we adopt. We first 
present basic identities highlighting the power law behavior, and then discuss the least square regression 
that solve our inference problem. Section [4] is dedicated to the theoretical study of the estimators, where 
we first exhibit almost sure convergence and then prove a central limit theorem. In section [5lthen, we 
illustrate our findings using Monte-Carlo experiments, and we present an illustration of the method on a 
high dimensional example. Note that the proofs of the results are given in the last section. 

2 Multivariate fBm, Filters 

We recall here some basic facts about the multivariate fractional Brownian motion. For more information 
and proofs of the results recalled here, we refer the reader to HH UHl Ej- 
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2.1 Some facts on the mfBm 

The p dimensional multivariate fractional Brownian motion x{t) is defined as a Gaussian process having 
stationary increments and having components jointly self-similar with parameters {Hi, . . . , Hp) € (0, 1)''. 

The self-similarity property can be stated as follows: for any real A > 0, x{Xt) ^= X^x{t) where H = 

diag(i7i, . . . , Hp) and is intended in the matrix sense. The notation stands for equality of all the 
finite-dimensional probability distributions. 

Joint self-similarity imposes many constraints on the correlation structure of the process. This has 
been studied in [28j where the general form of the covariance structure of a jointly self-similar process 
with stationary increments is obtained, without recoursing to the Gaussian assumption. This form is 
further studied in [6l [12]. The covariance structure is shown to be characterized by real numbers 
Pij G (— l,l),?7y € MjCTi > 0, i = > i. Parameter at is the standard deviation of the ith 

component at time 1, pij is the correlation coefficient between the components i and j at time 1, and 
as such satisfies pij — pji. Parameters rjij are linked with the time-reversibility of the process. They 
are characterized by the antisymmetry property rjij = —rjji. In special cases, these parameters are 
known [6| I12j. If the process is time- reversible, they are all equal to zero; if the process admits a causal 
(or an anticausal) representation, they are function of pij, Hi and Hj. In general otherwise, they are 
unconstrained. 

The covariance structure of the process is as follows. The process is marginally a fractional Brownian 
motion. Thus the covariance function of the ith component is the usual function |3H 134] 



E[x,is)xM ^^{\s\'"'+\t\'"'~\t~s\'"'}, (1) 

with, as mentioned, af := var(a;j;(l)). The cross-covariances are given by [6] ([28] for the proof and a 
different parametrization) 

Proposition 1 For all {i,j) G {1, . . . ,p}^, i 7^ j , 

r,,{s,t) E[x,{s)x,{t)] (2) 

= {wij (-s) + w,j [t) - w,j (t - s)} , (3) 

where the function Wij{h) is defined by 

_ r (p,, - rj,,sign{h))\h\"^+"^ if H, + H, + 1, , . 

^ ~ 1 P^M + \H if + a, = i- ^ ' 

As shown in (6] [12] , the form obtained for Hi + Hj = 1 can be recovered by continuity from the case 
Hi + Hj 7^ 1. Furthermore, setting evidently pa — 1 and noticing that r]ii = allows us to remark that 
the definition is valid if i = j since it is equivalent to ([1]). 

The constraints on pij and 77.^ are only necessary conditions to ensure that the matrix given by ([3]) 
together with Q is the cross-covariance matrix of a process. A necessary and sufficient condition has 
been exhibited in . This condition is the positivc-dcfinitcncss of the matrix with entries 

r{H, + H, + 1) {p,, sin {^{H, + H,)) - ir?,, sin (|(i/, + H,)) , (5) 

where i = -s/— T. Interestingly, the condition emerges when studying moving average and spectral rep- 
resentations of the mfBm. For example, a moving average representation can be shown to be given by 
(assuming Hi ^ 1/2 for i = 1, . . . ,p) 
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where M^(da;) = (W^i(dx), • • • , Wp{dx)) is a Gaussian white noise with zero mean, independent compo- 
nents and covariance E[Wj(da;)Wj(da;)] = 6ijdx. For given parameters a^p^rj, we can find easily the 
terms of the matrices (see This representation is interesting since it shows that we have 
access to a whole family of different processes with different characteristics governed by the parameters. 
In this paper, we will for example particularly focus on the so-called causal and well-balanced cases, for 
which we have respectively M~ = and A/+ = M~. The case M_ = sets a close link between p^- and 
r]ij whereas the case = makes the process time-reversible leading to 77^- = 0. The problem of 
simulation of such a process has been investigated in [5] using the Chan and Wood algorithm, [5]. Figure 
([T|) presents some examples in order to illustrate the process. 



The mfBm has by definition stationary increments. It is easy to derive the covariance structure of 
the increments process. Let Ax(t) = x{t -|- 1) — x{t) be this process (with increments of size 1) that we 
will refer to the multivariate fractional Gaussian noise. Then 

7y(/i) := E[Ax^{t)Axj{t + h)] (7) 
a, a J 



Wij{h - 1) - 2w,j{h) + Wij{h + 1) j . (8) 

The asymptotic behavior has been studied in [SHH]. We have as \h\ — > +00 

7,,(/i) ^ tT,aj|/i|^-+^-2^y(sign(/i)) (9) 

with 

.,(sign(M) = ( (P.-^^Bign(M)(i^ + i^)(iJ,+i/,-l) W + (10) 
^ \ ??»jSign(/i) ifH., + Hj^l. ^ ' 

We recover here the usual behavior of the scalar fGn: each component of the mfGn can be short or long- 
range dependent if its corresponding Hurst parameter is smaller or greater than 1/2, respectively. But 
in the multivariate case, long-range (inter)dependence can also appear in the cross-covariance. Indeed, 
from ^ we easily conclude that Jijih) is not summable as soon as Hi + Hj > 1, a case which can appear 
in three situations: 

1. = 1/2 = H, 

2. Hi < 1/2 and Hj > 1 - H 

3. Hi > 1/2 and Hj > 1/2 

In those cases, some troubles may appear when it comes to infer parameters of the models from data. 
Indeed, long-range dependence may lead to very slow convergence of estimators. 

As already observed in many works [TU lUl [11] , recoursing to wavelet types of transformation is an 
elegant way to overcome the problem. Indeed, using wavelet types of transformation with a correctly 
chosen filter allows to extract the stationary part from the fBm and allows us to "whiten" the increments. 
We describe such an approach in the following section. 

2.2 Discrete filtering technique and its consequence on the mfBm 

In the identification problem, we suppose to have access to a sampled version of the mfBm. We thus turn 
to discrete time. Let £ and q be two positive integers. We consider the following set of filters Ae,q- 

Ai.q = |(afe)/cez : Ofe = 0, VA: e Z^-* U + 1, . . . ,+00} and ^k}ak = 0,V/ = 0, ...,(?- l| 
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Figure 1: Examples of discretized sample paths of a well-balanced (rjij = 0) mfBm of length n = 1024, with 
p = 20 components. The Hurst exponents are equally spaced in [0.3, 0.4] (upper plot), [0.6, 0.7] (middle plot) and 
[0.4,0.8] (bottom plot). The correlation parameters are set to 0.7 (upper and middle plot) and to 0.3 (bottom 
plot). The components are translated artificially in the upper plot for the sake of visibility. 
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Typical examples are the differenee filter S^q — di_i and its compositions, Daubechies wavelet filters, and 
any known wavelet filter with compact support and a sufficient number of vanishing moments. 
For a G Ae,q and an integer m > f we define the mth dilated version of a, say a™ as 

o-k/m if fc G 
if fc ^ toZ 

Evidently, = a and a™ S •^i,q for ^^Vf itt-- The mth dilated version is thus simply obtained by 
oversampling a by a factor of m, i.e. by adding m — I zeros between each of the first £ + I coefficients of 
the impulse response a^. 

Let x{t) be a mfBm in discrete time. We mean by this that we have at hand a collection of samples 
regularly taken from a continuous time mfBm. Let a;™ be the signal obtained by filtering x with filter 
a™. Since x is multivariate, x™ is also, and its components are the components of x filtered by a™, 
a;"(t) = where 



X being Gaussian with zero mean, is also. Now we have 

7™-"^(/.) E[xT^it)xpit + h)] 



J2 al!'apr,j{t-k,t + h-l) 



k,iei, 
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of Wij we get 



The last equation is obtained since for any member of Ae^q, ^ ^- Using the definition of a™ and 

(h) ^ ^ akai{ptj ~ i]ijSign{h + mik - 7ri2l))\h + iTiik - m2l)\"''^"' (ff) 

The behavior of 7™^'™^(/i) has been studied in [T^] in the case of the continuous wavelet analysis of the 
continuous time mfBm. The result proved in Proposition 7 of the referenced paper can be developed 
also in the same way in the case of discrete wavelet transform or in the setting used here. We thus state 
without proof the following expansion and its consequence on the summability of 17,™^'™^ (•)!" for G N*. 

Proposition 2 

(i) As \h\ — > +00, the following equivalence holds for any mi,TO2 > 1 and any a G Ae^q 



^™.--(/,) ^ ^Zl^,^a,q)\h\"'^"^-'^n,ih) 



2 

2 



where K{a,q) := {^^)imim2)'^ \J2k^'^'^k\ and 

( {p^J ' ri,,sign{h)){"^+^') tf t = j and H, ^ 1/2 
njih) ^ \ ori^i and H, + Hj =^ 1 

[ %SSr +Hj = l and m + 1/2. 

(m) Let us denote by :~ max(i7i, . . . , Hp), then for any a G N* 

q>H^ + ^ ^ 7r'"'^(-)G^"(Z),Vz,j--l,...,p. (12) 
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Choosing the fihcr Si^ — Si^i allows us to recover The interest of filtering is revealed by taking 
higher order filters. Indeed, for a filter with two zero moments, the cross-covariance will be summable 
for all the possible values of the Hurst exponents. In some sense, the filtering aims at reducing the 
dependence of the cross-covariances function along time. Let us add that the key-ingredient for obtaining 
a central limit theorem for our proposed estimators is the square summability of all the cross-covariances 
functions. As stated, in this will be realized if g = 1 and iJ^ < 3/4 or as soon as g > 2. 

We now turn to the core of the paper. 



3 Estimation method 

From now on, we assume having at our disposal a sample path of a mfBm (with p > 1 components) 
regularly sampled at times t ^ 1, . . . ,n. For the sake of simplicity, we shall also restrict ourselves on the 
most interesting case Hi + Hj ^ = 1, . . . ,p. 



3.1 Basic identities 

The estimation principle relies on the covariance (jlip for a given m = mi = m2 > 1. We have 
l^W :=7ir"W = -^ E a,ai{p,,^rj,,sign{h + m{k~l)))\h + m{k-l))\"'^"\ 

In particular, at lag we obtain 

7™(0) = m^^-af(-i ^ a,a,|fc-if^') (13) 

7™(0) - m^'+«^p,,a,a,(-i ^ afca,|fc-;|^'+''^). (14) 

To obtain ([13]), we have made use of the fact that X^I ;=o ~ 01^ ~ i^Ht+Hj _ q note that 

the parameters of interest appears in the slope of the log covariance at lag when considered as a 
function of logm. To obtain such a relation for the remaining parameters rjij, we must remember that 
these parameters characterize the time asymmetry of the process. Since for a Gaussian process, time 
reversal invariance is equivalent to "fij{h) = jji{h),yi, j , it is tempting to extract rjij from differences like 
^ij{h) — Jji{h). Indeed, we have 

-f^{m£) = m"^+"^a,crj{p,j ~ ^ ^ akai\£ + k ^ 

where we have used the fact that the filters are zero as soon as fc > ^ and thus sign(i' -t- A: — /) = 1 in the 
double sum. Let us introduce the function 

Kj{h) := -i ^ akai\h + k-l\"''^"\ 

where the indices i,j correspond to the fact that tt depends on the corresponding Hurst exponents. Let 
us underline that for all Hi,Hj e (0, 1), 7r° (0) > for any filter a. We thus have obtained the following 
equations 

7ir(0) = m^^^afnm, V* = l,...,p (15) 
7,7(0) = m"'+"^p,,<j,a,7r':^iO), - 1, . . . ,p, j > z (16) 

7™(m^)-7;^(m/) = 2m"'+"^ r^,,a,a,K^{£), Vz = 1, . . . ,p, j > z (17) 
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Equations (jl5|) and their wavelet counterparts in the scalar case have been used by many people to 
estimate the Hurst exponent {e.g. [inilllllll] to cite some but a few). The two others are direct extension 
and are going to be used in the sequel to identify the mfBm. 

At this point, the question "which parameters do we want to estimate and how" must be asked. If 
we want to only estimate Hurst parameters Hi, . . . , Hp, do we have to use only the p equations (jlSp . 
or do we gain something by adding the p{p — 1) others? The parameters H will be estimated by linear 
regression (in the log variables). Can we use these regressions to estimate the other parameters a,p and 
r], or is it better to consider usual empirical estimates? 

We try to adress all these questions in the following. 

3.2 Methodology 

We apply the filtering for all values of m taken from a discrete set A4 of cardinal and we thus obtain 
the multivariate signal a;'"(t). We then evaluate the empirical estimators 

_j n — h 

^^('^) = — E <'w^r(i+M, (18) 

n — ml — n ^ — ' 

t=me+i 

C™ (0) thus corresponding to the empirical moment of order 2 of the signal a:™ . As we may expect that 
C^{h) correctly estim ates 7™( fe), we will use this estimator to estimate the parameters of the model. 
Precisely, inspired by ()15|16|17|) . let us introduce 

vr := log Clf (0) a. := log (af (0)) , 

c™ := log I C™ (0)1 /i. log (a,a,|p,,|7r- (0)) , (19) 

:= log0.5|C-(m^) - C™(m^)| := log (a^a^.j^m ■ 

We have to underline here that it is assumed that none of the parameters pij and rjij is equal to zero. 
This could be a limitation since zero expresses the absence of correlation or the time reversibility but as 
we will see later the derived estimates of pij and r]ij actually do not depend on this assumption. Then, 
we can then write 

= 2i?,logm + ai V i^l,...,p, 
= (i?, + 7?j)logm + ^ +e™. V i = 1, . . . ,p;j > i , 
d"^ = {H, + H,)\ogm + iy, + ef^^ V i ^ 1, . . . ,p; j > t. 

The noise terms s measure the deviation of the model and can be written as 

£™ = t;r-logTir(0), 
e™. = c™-log|7™(0)|, 

= d:^-log0.5\jl]{m£)-Y;:im£)\. 

Let us now consider the vectors H = {Hi, . . . , HpY ,a = (ai, . . . , ap)*, ^ = (Mjj)i=i....,p:j>iJ = 
i^ij)i=i p j>i- 1^'^'' these two last, the ordering chosen to create a vector is of no importance. However, 
to fix ideas we will use the identification k{i,j) = {i — l)p+ j — i{i + 1) /2 which corresponds to a numbering 
following rows. In all the following, we will often switch from matrix notation to the vector one, but the 
context will make it clear. 

We suggest to obtain the above parameters by minimizing the following weighted mean square error 
objective 

f{H,a,^^,ly)^ ( E + E i^'^f + ^'i E i^Zf]- 
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The interest of this objective function is in the fact that it combines the three different types of "obser- 
vations" , empirical variances, empirical correlation and empirical measure of asymmetry. The weights 
allow us to consider the advantage of including one of these types of observation in the inference problem. 
For example, in the usual setting, we will set Wc = Wd = and this will allow us to estimate the Hurst 
exponents. Considering only Wd = allows to add in the observation the empirical correlation in the 
hope that it will ameliorate the estimation of the Hurst exponents. 
We now solve the optimization problem 

a, fl, i)) = argmin/(iJ, a, fi, v). 

The details of the calculation are provided in section (|6.ip . To write down the result, we introduce the 
vector of Rl^l, L (log mi, . . . , log m\M\Y ■ The variables Vi, Cij and dij without exponent m stand for 
the vectors of RI-^I collecting respectively w™, c™ and for m = mi, . . . , m|^|). Furthermore, define 

for any vector x € M'^' its mean x := ^mGM ^"i- ^^"-^ centered vector x = x — x. Then, the 

parameters optimizing / are given by 

dfc = Vk-2HkL, (20) 
= c,j-{H^ + Hj)L, (21) 
i>y = d,j-{H, + Hj)L, (22) 



Hi, = 



A 

(Wc + Wd) 



WcCkj + Wddkj) 

p 

{X+p{Wc+Wd)) f^^^ /J 

where A := 4w„ + (p — 2)(u'c + Wd). Note that setting Wc = Wd = ^ and Wi, = 1, we find for 

* - wv 



which is the estimator found when estimating the Hurst exponent of a scalar fBm [TT]. Equation (f23| 
appears therefore as a generalization for which the estimates are still independent of the other parameters 
(cr^, p, 77). When p = 2, w^, = Wc = 1, = 0, it takes for example the simple form: 

To conclude the identification of the mfBm, parameters oi , pij , ry^ , can be estimated by plugging 
estimators (|20l21l22p into Equation (HH). We then obtain 



e 'J 




c™(0)| y v'-S(o)-«(o) 



1^ I ^ ^^11, \cinmi)-c-{mi)\ \ ^nmm 
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where TTfj{h) stands for '!Tfj{h) in which parameters Hi are replaced by their estimator Hi. or 

The r.h.s. of and (P7)) are directly obtained from and (|21l22p . These last forms are very 
interesting, since they are not suffering from the log-transformation and therefore if the model is such that 
Pij or rjij equals zero, such parameters can still be estimated (and thus tested). Noting from (|16ll7p that 
sign(E[C(5'(0)]) = sign(7™(0)) == sign(p,j) and sign(E[C™(m£) - C^^img)]) = sign(7,'^(m^) - 7j'?(m£)) = 
sign(77ij), then estimates of pij and rjij can be defined as follows 

p,, = X sign(C™(0)) and 7% = |%| x sign(C,'^(m^) - q?(m£)), (28) 

for any m > 1. Also, letting \Ai\ = 1 allows us to recover the natural empirical estimates (obtained with 
one filter) 



. c-(fi) ^^umm „ _ g-(^)-q»(£) ^^m^^m 

^C]j(o)Cj^(o) y^C];^(o)q';(o) *?,-(^) 



In some sense, (|26l27p can be viewed, up to a factor and a sign, as the geometric mean of 



|C"(0)l 



and ( ^'^'^''"^1 N'^^rll™/"'^ I ■ Finally, note that we use n instead of n in the renormalization of the esti- 



mators, and therefore, these estimators are coupled with the regression estimators of the Hurst exponents. 



4 Convergence analysis 

We concentrate in this section on the convergence of the estimators defined by equations (pS)) . (pS]) . 
(|26l) . (f27| and We denote by 6* = (iJ*, {a^Y,p\T]*y the whole set of parameters where the vec- 

tors H, a^,p and 77 contain all the parameters to be estimated (a vector of length p{p + 1)), stored 
in an appropriate order. Likewise, let 9 be the corresponding estimator with components ordered as 
the ones of 9. Finally note our abuse of notation: the dependence on n, the number of observations, 
is not explicit in the notation of the estimators. We begin by addressing the almost sure convergence of 9. 



Proposition 3 For any filter a € Ae,q, any set of dilations M and whatever the values of the weights 
Wy,Wc and Wd are, then the vector 9 converges almost surely towards 9, as n — > +00. 



Let us underline that the almost sure convergence holds whatever the number of vanishing moments 
for the filter a chosen, that is for all q > 1, and for all the values of the Hurst exponents. A similar result 
was already proved when p = 1, i.e. for a scalar fBm Proposition 2 (i)). The proposition is proved 

in section (|6.2p . The proof relies mainly on proving the almost sure convergence of C™(/i) to 7™(/i). 

We now state the central limit theorem for the estimators. To state it, we need the additional 
definitions and notation: let 

C„ := (C™(0),C™(0),C^(m^),C™(-m£), (me Al,*,j = l,...,p,j>7))* (30) 
7 = E[C„] = (7,T(0),7"(0),7"M)'7™(-'^^)' (™ € X, 7,j = 1, . . . ,p, j > z))* . 
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In these notation, the vectors of length Dm.p + 3|A^|p(p — l)/2 = |A^|p(3p — l)/2 are ordered 

as fohows: first we put the empirical variances starting with the component xi for all the values of m 
and then the component X2,. • ■ • Then, we place the empirical covariances at lag zero (with the same 
convention: i, j fixed, and m is varying), then the empirical covariances at lag and finally the ones at 
lag —m£. 

Now, we note that the definition of the estimators by equations ([^ [23 HH HZl UHl) define unambigu- 
ously a function g : R^-^ p — j> Rp(p+i) that maps the vector C„ to the vector 6. In the following result, 

the notation -4 stands for the convergence in distribution as n — > +oo. Recall that the quantity iJ^ 
denotes the largest Hurst exponent, that is := maxi=i_..._p 



Proposition 4 Under the notation and assumptions of Proposition with the order of the filter, q, 

satisfying q > + 1 /4, then 

(*) 

v/^(C„-7)4aA(0,E), (31) 

where S is the Dm.p x Dm,p rnatrix explicitly given by '^'^d |^5[ ) v \22\ 
(ii) The vector 6 satisfies 9 — 3(7), g is differ entiahle in 7 and 

^{e-e)^ N{q, Vg(7)SVg(7)*) , (32) 

where V(7(7) denotes the gradient of g (a p{p + 1) x Dm.p rnatrix) evaluated at 7. 



The matrix E is quite complex but it may be evaluated (or at least approximated because it containes 
infinite series) in order to build asymptotic confidence intervals. The most interesting point of this result 
is that the rate of convergence of 9 is the optimal one, ^/n (for the whole set of parameters and whatever 
the values of the weights Wv,Wc,Wd) as soon as q > 2 (see also remark after Proposition [2]). The proof 
of this proposition is given in section [6.31 It mainly consists in establishing (i). The second point {ii) 
will be derived using the classical delta method. We underline that no assumption is made on 7 in (ii). 
This means that the differentiability is true for all the values of the parameter vector 6 (such that the 
models exists). In particular, there is no differentiability problem when py and/or 77^ equals zero. This 
may allow us to use (|32|) to test the absence of correlation at lag zero or to test the time reversibility of 
the process. 

In the next section, we show via experiments that when interested in estimating solely the Hurst 
exponents, the best strategy is to set Wc and Wd to 0. In this case, the gradient vector of g evaluated at 
7 reduces to terms of the form 2^*^ ^"^{o) ^^"^j '^ith little algebra, we may derive the more simple and 
nice central limit theorem 

^{H-H)^M(^0,^^^iIp®LY^{Ip®L)^ (33) 
where S is the Mp x Mp matrix with elements 



2E 



,,^7"4(0)7"i;(0)- 

And we end this section by remarking that when p ~ I, (|33]) is in agreement with Proposition 4 (ii) 
obtained in [Tl] . 
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5 Experiments 



The previous theoretical results are important since they insure convergence at a good rate. However, 
the complexity of the variance terms makes the results difBcult to exploit, especially when we want to 
compare different estimators corresponding to different choices of the weights w>u,c,d- Hence, we now turn 
to some Monte-Carlo experiments to study the estimators, and we present an illustration of the method 
on a high dimensional example. 

5.1 Experimental study of convergence 

Depending on the weights used in the objective function, we may study 8 different situations. However, 
all of them are not a priori useful and do not perform well (a posteriori) . For example, setting Wv = Q 
leads to very poor performance and the corresponding cases are not studied. We limit ourselves to the 
three situations for the estimation of the Hurst exponents: 

1. Wy = \, Wc ~ Wd ~ Q'- this case corresponds to applying the univariate estimator to each of the 
components of the mfBm. To indicate this situation, we add a w in exponent to the estimators. For 
example we will study . 

2. Wv = ^ = Wc, Wd = 0: in this case, we study the advantage of including the empirical correlation in 
the observation set of the linear regression. To indicate this situation, we add a c in exponent to 
the estimators. 

3. Wy ~ 1 = Wc = Wd- in this setting, the empirical measure of asymmetry is also taken into account 
in multiple regression. To indicate this situation, we add a d in exponent to the estimators. 

We have generated 100 snapshots of length n = 1000 for different cases of mfBm: different Hurst 
exponents, different dimensions, different correlation coefficients, for the causal, well-balanced and general 
mfBm. For the causal case, parameter T]ij is dependent on pij, in the well-balanced case it is zero, and we 
have set it to 0.2 x (1 — iJj — iJj ) for all i,j in the general case. The range of the parameters is constrained 
by the existence conditions recalled by equation ([5]) . For each case described, we have evaluated the Mean 
Square Error (MSE) of the estimators in the v, d, c cases reported above. For the correlation and the 
asymmetry coefficient, we have studied the v and d cases only, as well as the empirical estimators, in 
which parameters H^' are used in the renormalization. The hyperparameters used in this simulation study 
are M = {1, . . . , 5} and the generic filter a = dbi corresponding to a wavelet Daubechies filter with two 
zero moments. These choices are guided by the fact that they provided good results when dealing with 
a monovariate fBm, |llj . Other parameters have been tried leading to the same general conclusions. 

The results arc reported in tables I1I2I and |31 The main conclusions from these experiments are the 
following: 

• Regarding the estimation of the Hurst exponents, adding the empirical correlation as an observation 
over which regression is performed does not improve the performance of the scalar estimator applied 
to each of the components. If H^' and H'^ perform equally well for high correlation coefficient, the 
performance of the latter is at least one order of magnitude less than the performance of the former 
when the correlation coefficient goes to zero. The performance of H" appears almost independent of 
p. The same conclusion holds for H'^. However, the estimators including the empirical asymmetry 
is considerably degraded, at least two orders of magnitude worse. 

• The estimation of pij using the renormalized empirical estimator or the regression estimator based 
on the variance data only (when plugging in the estimate of H) leads to the same level of perfor- 
mance. 
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• Parameters rjij are very difficult to estimate, at least with the method adopted here. The difficulty is 
conformed in figure [5] where the MSE is plotted in a log- log plot of the estimated standard deviation 
versus the sample size. If ^/n is clearly observed for the other estimators, it is not (almost) observed 
by fjij, at least for the sample size up to 2-^^ points. 



Thus, in the following, we focus on the convergence of the estimators of the Hurst exponents, the 
correlation and the variance {H, ai, pij, rjij)'". 

We have already remarked that the estimators of the Hurst exponents seem almost independent of 
the correlation. We thus study the behavior of the estimators with respect to the correlation. For p = 2, 
we use Monte-Carlo simulation (1000 snapshot of 1024 samples each, m = 5 dilations used) to plot the 
MSE of each estimator as a function of pi2- Results are displayed in figures In the left plot, we study 
the causal case for Hi = 0.3 and H2 = 0.4, whereas the right plot is concerned with the well-balanced 
case with Hi = 0.3 and H2 = 0.8. In the left plot, the admissible range of p is almost all the interval 
(— 1, 1), whereas it is restricted to approximately (—0.5,0.5) in the case of the right plot. 

The main conclusion of these plots is the fact that the estimation of the Hurst exponents and of the 
variances are almost not dependent on the correlation coefficient between the components of the mfBm. 
However, the quality of the estimation of the correlation coefficient depends on the actual value of the 
coefficient, and depends on it in a rather strange non monotone way. Indeed, the MSE increases with p 
for moderate values and then decreases. 



5.2 A high dimensional example 

As a conclusion we present an illustration of the model and its identification in the context of complex 
networks. Suppose that we observe a, p = 100 dimensional iBm obtained from a graph as follows. Let A 
be the lower triangular part of the adjacency matrix of the graph (i.e. Aij = 1 ^ nodes i and j < i are 
connected, and Aij = otherwise). Then, each non zero elements is given a random value, and we use 
p = {I — A)^^ {I — A)~* as the correlation matrix of the mfBM. The rationale hidden there is the following. 
Let X be a 100 dimensional vector such that X = AX + B where B is 100 dimensional Gaussian random 
vector with zero mean and identity correlation matrix. Then X has obviously p as correlation matrix. 

As underlying graph, we choose a Watts-Strogatz model. A Watts-Strogatz network is a model of 
complex network that jointly presents the property of small-world effect (small mean geodesic distance) 
and the property of high clustering (neighbours of a node are strongly connected) |42j . This model was 
one of the first that adequately described graphs with these two properties. It is in a sense in between 
Erdos-Renyi random graph (low clustering and small mean geodesic distance) and regular grids (high 
clustering but low small mean geodesic distance) . It is obtained by randomly rewiring edges in a regular 
grid. In the example depicted here (see figure |4]), we use a ring of nodes where each nodes is connected 
backward and forward with two neighbors, and each edges is rewired to a randomly chosen node (possibly 
the same) independently of the others with probability 0.2 (self-connections are prohibited). This gives 
the adjacency matrix we use to create a 100 dimensional fBm as described above. 

The resulting sample paths are illustrated in figure Q where we have plotted in some insets some 
components. For example, component 19 with Hurst exponent 0.3 is positively correlated with component 
76, which Hurst exponent is slightly greater than 0.7, but negatively correlated with 18 which Hurst 
exponent is 0.64. Furthemore, since Hig -\- Hjq > 1, the two components are long-range cross-correlated, 
whereas 18 and 19 are short-range correlated. 

We have generated a sample path of length 8192 samples, on which we apply our estimation procedure. 
Since the procedure does not depend on the dimension, we of course obtain good results for the estimation 
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Maan Square Errors 
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Figure 2: Estimated standard deviation of the estimators obtained from the regression as a function of the size 
of the sample. The plot is a log-log plot. The estimators of H is , and this is used in the estimation of the 
others. We observe a clear l/^/n behavior. This rate is confirmed by the theoretical analysis. Note that this 
result is not clear for fj. The sample size should be much gretear to validate or invalidate this rate. Parameters 
chosen here : Hi = 0.3, H2 = 0.8, ai = 2, 172 = 1, P12 = 0.4, r] = 0. 



empiric;3l MSE. H^(Q.3:0 4], empirical MSE. H-[0 3:0.B). ti^2:1J, well-talancad 




Figure 3: Estimated mean square error of the estimators obtained from the regression as a function the correlation 
coefficient. Left plot: Hi — 0.3, H2 = 0.4, ai = 2, (T2 = 1, causal case. Right plot: Hi = 0.3, H2 = 0.8, cri = 
2, (T2 = 1 well-balanced case. The inset depicts a zoom for the MSE of p vs p. Note that in the right plot, p can 
not vary in the whole interval ( — 1, 1) because otherwise the model is undefined. 
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of the parameters. In figure ([5]), we plot the true vector H and its estimation, as well as the correlation 
Pi^i+i and its estimation. We also show in figure ([B]) the true partial correlation matrix and its estimation 
via the inverse of p. Recall that in the Gaussian case, a zero partial correlation between two components is 
equivalent to the independence between the two components conditionnally to the remaining components. 
This could be used to infer dependence link between the components of the process, as is done for example 
in [3] for connectivity studies in the brain j36j . 



6 Proofs 

6.1 Optimization of / 

We differentiate / with respect to all the parameters and set the derivatives to zero to obtain necessary 
conditions for optimality. For parameters a we solve and obtain immediately 

^ TTT\Y. [vk-'^Hklogm] (34) 
= Vk-2HkL, (35) 

where L = (logwi, . . . , logm|A<|)*, Vk = (u"' , . . . , u"'-^' )*, and xj = \M\^'^ J2nieM^T- Likewise, we 
easily get 

Aij = Cij-{H, + Hj)L (36) 

% = d,j-{H, + Hj)L, (37) 

where /iy = (^™^, • . • ,/^™''^')*. Obtaining parameters Hu requires a little bit more work. Differentiating 
/ with respect to Hk and setting the result to zero leads to 



]- («r - 2fffc log m - afe ) (2 log m) + 



]J4\ 51 H (^fcJ " (-^i + ) log m - Ilk J ) log m + 
JXil ^ '^'^^ " + ^'^^ logm - J/fcj) logm = 0. 

Introducing the centered vector x ~ x~ x and replacing in the previous equation parameters a^, fikj and 
i^kj by their estimate (|35l36l37p . we obtain 

2w.u{L'vk - 2HkVL) + Wc{L'jlrj - [Hj + HuJL'L) + Wd{tv^j - {H, + Hk)L'L) = 0. 
Isolating H terms leads to 

d 

Hki^w,, + {p- 2){wc + Wd)) + {wc + Wd) Hj = (L*i)"^i*|2t;fc + ^(u^cCfcj + Wddkj)] 
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Figure 4: Watts-Strogatz graph used to model correlation between the components of a 100 dimensional mfBm. 
The two south-east inset depict components 18 and 19 that are in the example negatively correlated pig, 19 = —0.09, 
with His ~ 0.64 and Hig = 0.30. The north-west inset depict component 76, positively correlated with 19, 
P76,i9 — 0.13, with Hurst exponent Hjq — 0.7. 
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Figure 5: Estimation of the Hurst exponents and of Pi,i+i for the high dimensional example described in Sec- 
tion [5]1 
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Figure 6: True and estimated partial correlation matrix as the inverse of the correlation matrix for the high 
dimensional example described in Section [5.21 
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Collecting the p equations into a vector, we have to solve 

{{Awv + {p- 2)(wc + Wd))Ip + {wc + Wd)Jp)H = X (38) 

where the fcth component oi X is {L^L) ^ L'^ ^Vk + ^j_^^.{wcCkj +Wddkj)^, where Ip is the p dimensional 
identity matrix and Jp is the p dimensional matrix which entries are all equal to 1. Now consider the 
auxiliary result 

Lemma 5 Let A, A' > 0. The {p,p) matrix B = A/p + ^(A' — A) Jp has eigenvalues A and A' of respective 
multiplicity p — 1 et 1. The inverse of B is thus 

,-1 1^ 1^1 1 



^^a^^ + pU^aJ^- 

Applying Lemma[5](for which the proof is omitted) to ([38|) and noticing that for any z e K'-^' , L*z = Uz^ 
we obtain ((23|) . 

6.2 Proof of Proposition [3] 

Proof. The only thing to prove is that for fixed h, for all z, j = 1, . . . ,p and m & Ai, 

Cl]{h) j^{h), (39) 

as n — !■ +00 (the notation stands here and in the following for the almost sure convergence) . Indeed, 
if ([M]) is true, the following convergences hold 

log0.5|7,'^(m^) - jj:im£)\ - {H, + H,)\ogm + v,,. 

Plugging these results in pS)) and noting that L*l = leads with a little computation to the convergence 
of iJfc to fffc for all fc and then to the convergence of Sfe to afe. The convergences of <J^,p and 77 follow 
from their respective definition and (|39|) applied to = 0, ±m£. 

Let us now focus on the proof of ([39]) . Define y(k) = x™(A:)x™(fc + h) and assume that y(-) is 
observed at times 1, . . . , n. This is not a loss of generality since for fixed to, £ and h, n — mi — /i ~ n as 
n ^ +00. Let y £^[F(A:)] = -fljih) and y„ := n^^ X^Li ~ y- F^in Theorem 6.2 of [Hj, p. 492, 
establishing a condition under which almost sure convergence is implied by mean-squared convergence for 
the convergence of empirical means of discrete stationary processes, the proof will be ended if we manage 
to prove that E\y'^] = o(l). Since y{-) is a stationary sequence 

fc=l T=l 

where rj, is the covariance function of y(-) given by ry(r) := E[y{k)y{k + t)]. Using for example Isserlis 
formula, |23j . we can derive 

ry{r) = 7,T(T)7;?(r) + 7™(t + /i)7™(r - /i). (41) 

Proposition [2] (i) states in particular that 7ij(T) = Odrj^'+^J^^*) as |t| — > +00. Moreover, let us recall 
that for H G (0, 1) 



n (1 



0{l/n) if g > 2 or g = 1 and iJ < 3/4. 

©(logn/n) if q = 1 and = 3/4. (42) 



|r I <7i 



(1 + |t|)2(2H 25) 1 (^(^/^2-2if) ifg=landi7>3/4! 
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From (gni) and (gT]), then using Cauchy- Schwartz inequahty and (jH]) (with H = H^.Hj, (iJj + Hj)/2) 
allows us to conclude that -E'[y^J = o(l). ■ 

6.3 Proof of Proposition [J] 

The proof of the central limit theorem is done in three steps. First we prove in Lemma [S] a central 
limit theorem for C™(/i), then a central limit theorem for the vectors containing all the data used in the 
regression, i.e. C™(0), C^[ml) and Cj^{ml), i.e. the proof of Proposition |4] (z). Finally we apply the 
delta method to prove the central limit theorem for the estimators. We thus begin with the first limit 
theorem, stated 0jS cL lemma: 

Lemma 6 Let i,j = 1.. . ,p, m G A4 and h G {0,m£} and let a G Ae.q with q > TaSix{Hi, Hj) + 1/4. 
There exists < +oo such that the following convergence in distribution holds as n — > +oo 

MC^W - j,,{h)) A AA(0,r2). (43) 

We note that a central limit theorem also holds for C^{—mt) since we recall that C'^{—mt) — C'^{mi). 
Proof. From the definition p^. 

n — h 

c^{h)-j:^{h) = ^^^^^^ E K{k)x-^{k + h)^^^{h)] 

^ n—7n£—h 
k=l 

where y{k) = (xf (fc + m£), Xj [k + mi + h) f for k = 1, . . . , 7i — ml — h and where 

/: ^ 

y=(yi,2/2)* ^ ym-i^ih). 

Since Cl'j (h) — (h) can be expressed as a centered empirical mean, the result is based on the application 
of a multivariate central limit theorem for non-linear functional of stationary Gaussian sequences obtained 
by [1], Theorem 4. For this, note first that the Hermitc rank of the function / (in the sense of g]. Equation 
(2.2)) is two. Secondly, the condition on q and Lemma[2](ii) (with a ~ 2) ensure that for any i',j' — i,j 

fcez fcez 

Theorem 4 of |3] can be applied, leading, as n — > +oo, to the convergence in distribution of \/n — mi — /i(C™(/i) — 
7j™(/i)) to a centered Gaussian random variable with finite variance (that we de not want to explicit here). 
The result is obtained since m and h are fixed. ■ 

Now, let us focus on the proof of Proposition |4l 

(i) To prove this convergence, we follow the Cramer-Wold device |17| and prove that for any a S 
jj-Da^.p ^ Q,ty/^((7^^_^) converges in distribution to a*Z where Z is a random normal vector. Let a S R-^^^p 
be decomposed as follows 

a = (a™,"^",""''^,"" G M.,i,j = 1, . . . > i))* , 
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ordered as C„ and 7. Then, 

s„ := a*{Cn-l) 

n 



n — ml 

k=mt+l 



{ 1 " 

+ EE ""^73^ E [C(fcK(fc)-7-(0)] 

^ 71— mi 



'J n- 2ml 

k=mt+l 



Let Af = maxmg^ m (M < +00) and define 

n~Mt 



J2 [xJ'{k)xT{k + ml)~j^{-m£)] 



n — 

■ =Me+ 



^ E EE<K"(fc)^-7"(o)] 
+EE - ^..w] + a:^'+[xr(fc)^™(fc + m^) - 7.,m)] 

+«"'~[a;f (fc)x™(fc + m£) - 7»j(-"J^)] 



Our first aim is to prove that ^/n{sn — s„) — >■ as n — )■ +00 (here and in the following — > stands for the 
convergence in probability). For this, let us decompose the difference — Sn = + <^2,ri where 

'^1- ■= EE;7T^E<[-r(fcf-7"(o)] 

+E E E «r[-r(fcK(fc+-^)-7"M)] 
+E E ^73^ E %-K'(fc).r(fc+m^)-7i;'(-m€)], 

where for m 7^ M, /,„ {m£+l, . . . ,Ml}U{n-Ml, . . . = {m£+l, . . . ,Ml}U{n-Ml, . . .,n-ml} 

and = {n — M£, . . . , n}. The remainder term d2^n is given by 



d^^n := ^^^|eE<^(0)+EE«"^(0) 



2(m- M)^ 
n-2Ml 

j>i m^M 
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Let ei_„ and 62, « be the firt (generic) sum terms of di^„ and d2,n given by 

1 



ei.n 



e2.n 



n - 2Mi 

(m - 2M)£ 
n-2Me 



<C™(0). 



We now prove that \/nek,n — > (fc = 1,2). The other terms follow similar arguments. Since the sum in 
ei^n contains a finite number of Gaussian random variables, Var[y/nei^n] — 0{n^^^^), which implies the 
convergence of ^/nei_n to in and so in probability. Finally, since ^fne^.n = cm ^™_2jv/£^ ^J^^ii (0) ; 
Lemma [S] and Slutsky's Theorem ensure the expected convergence. 

As a consequence of the previous computations, we can concentrate ourselves on the asymptotic 
normality of Sn. For this, let us define y{k) = (a;™(fc + m£ + 1), a:™(fc + 2m£ + 1), (to G M,i = 1, . . . ,p))* 
for fc = 1, . . .,n-2Me and 

fa : M2|A4|p ^ ^ 



+ E,>, E™ KnyTvT - 7."M)] + 

Then s„ can be expressed as the following empirical mean 

1 



'[y'pyT-i'^i-ml)]). 



n-2Me 



2M£ 



k=l 



For any vector a, the Hermite rank of the function fa is 2. Now, similarly as the proof of Lemma |6l we 
notice that for all i, j = 1, . . . mi,TO2 € Al, < +00, 



7y ■ (k + h) =2_^l,j ■ (fc) <+oo. 



fcGZ fcez 
as soon as q > H"^ + 1/4. We can therefore apply Theorem 4 of [1] and obtain as n 



-00 



0,r2 ■.^J2^[faiyit))fa(.y(.t + k))] 



fcGZ 



The variance r is given by t 
into 



Si 


5^12 


^13 


Ei4 


\ 


y* 

^12 


S2 


S23 


S24 




^13 


■^23 


S3 


S34 




y* 


■^24 


y* 
^34 


E4 





Q!*Ea. According to the definition of s„ , the matrix S can be partitioned 



(44) 



The matrix Si is the \A4\p x \M\p covariance matrix of the vector containing the x™{t)'^, Sfe (for 
fc = 2,3,4) are the d^^p x dM.p {dM,p = \M\p{p — l)/2) covariance matrices of the vectors containing 
the a;™(i)a;™(t), the x™(i)a;™(t + mi) and the Xj^{t)x'^(t + mi) respectively. Other matrices are cross- 
covariances matrices (with dimension dM,p x dM,p)- Thus, the dimension of S is Dm.p ^ Dm,p where 
Dm.p \-M\p + 3dM,p = |A^|p(3p— l)/2. Generic elements of these matrices can be evaluated using for 
example Isserlis Formula, [23] . The notation used hereafter follow the ordering of the vector C„ pO|) : the 
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indices i, j, «i, ji, 12, J2 vary from 1 to p (such that j > > ii,j2 > ^2), the indices TOi,m2 vary in A4 



= 2Efce.7r''"^(fc)'' 

= Efea E (*)^"^ (0 - 7"j, (0))(xi:= + fc)a:^7 {t + k) - 7iri. (0))] 

= E..z[7™(fc)7"jr(fc)+7™(fc)7r.r^(fc)]' 

(S3)™ = EfcezE [«ni)^r(^ + - lZni^im^t'{t + fc)^r(* + "^2^ + - 7li:i,(m2^))] 

= Efc«[7ir4''"^(fc)7™rMfc + ("^2 - mi)^) + 7irjr^(fc + ^^^K^ir^^ - mi^)], 

(Si2)™Vr = eLz E f(<^^ it? - 1^1' (0))«" + (t + - CI. (0))] 



2E.ez7:-^^""(fc)7™^'™^(fc), 



(Si3)r/;r = Sfcez E [(x™^ - mK' (t + k)^T it+k+ ^2i) - 7"i. 

2Efcez7™""^(fc)7™^''"^(fc + ^2^), 



(^23)^:1,, = Efclz E [(x™^ (t) - + ^)^r + + fc) - 7S 
= Efeez[7rr^(fc)7™]r(fc+m2^) +7r]r^(fc+"^2^)7",r^(fc)]' 

(^34)!:^;-; = e'L^e t'?™^ (t)x^":^ + m.t) - 7™j, (^1^))^-:^ (i + (t + m2£ + k) - 71;^, (-^2^))] 
= Efcez[7i^r^(fc)7™r^(fc + ("^2 - m,)i) + ^Tjrik + ™2£)7™]r^(fc - ^i^)] 

(m) Recall that the function g : R^-^f maps the vector C„ to 0, i.e. ff(C„) = ^. Now, we 

leave the reader to verify that replacing the vector C„ by 7 allows us to retrieve 9, i.e. (7(7) = 6*. Thus in 
view of applying the delta method [53], we only have to prove the differentiability of g in 7. We do not 
want to provide all these heavy justifications and computations which are not very informative. We only 
focus on the terms that could lead to a problem that is the term related to pij and If \M\ = 1, the 
estimates of pij and r^ij reduce to (j29p and it is simple to check that the function g is diffcrentiable in 7 for 
any 7, which means for any dilations set M and any set of parameters H, a^, p,r] (ensuring the model is 
well-defined) and in particular for some components of p and/or rj set to zero. Due to the absolute values 
used in the definition of pij and rjij ()26l27l28p when > 1, a problem of differentiability could appear. 
We show hereafter that it is not the case. We just choose an example, namely the partial derivatives 
of Pij and assert that the other terms follow similar arguments. We make an abuse of notation in the 
following but we believe the context is clear. So, let us focus on the definition of pij ()26|28p where again 
(only) for the sake of simplicity of the presentation we assume that 7f°^ (0) = 7r°^ (0). Then, let j > i 

:iCn) = j|4(t^n)XBign(C™(0)) 



1 IajI 



And therefore, when evaluated at 7 we obtain 



dC^iO)'" \M\ mH'+"^aiaj\p^j\7TfJ0) \M\ m^»+^.a,(7j7rf.(0) ' 
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which does not involve any continuity problem for the whole set of parameters M,H,a^ , p and rj. In the 
same spirit for example when differentiating pij with respect to C™(0) we get 



dC^iOy 2|A^|Qf(0) 

leading to 



and the conclusion is the same as previously. 
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Parameters 


causal mfBm 


well-balanced mfBm 


general mfBm [rj = 0.2) 


7UV TJC Tjd 

H H H 


TJV TJC Tjd 

H H H 


TJV TJC Tjd 

H H H 


H=0.2 p = 0.1 

(p = 2) 0.5 
U.y 


0.0006 0.0078 0.0126 
0.0005 0.0007 0.0140 
0.UUU5 U.UUUo U.Ui4U 


0.0005 0.0062 0.0203 
0.0005 0.0006 0.0119 
0.UUU5 U.UUU5 U.U148 


0.0005 0.0060 0.0131 
0.0006 0.0008 0.0126 

A AAAC n AAA(^ n A1 /I A 

0.0005 U.UUUo U.U14y 


H=0.5 p = 0.1 

(p = 2) 0.5 

U.y 


0.0009 0.0136 0.0171 
0.0008 0.0011 0.0132 
0.UUU8 U.UOOy 0.U14U 


0.0007 0.0100 0.0204 
0.0008 0.0011 0.0169 

n nnnn n nm n n ni ah 
0.UUU9 O.UUlU U.U146 


0.0010 0.0085 0.0111 
0.0008 0.0012 0.0125 

A AA1 A n AA1 A A A1 OtJ 

U.OOlO U.UUlU U.Ulzo 


H=0.8 p = 0.1 

(p = 2) 0.5 
U.y 


0.0010 0.0085 0.0148 
0.0008 0.0012 0.0099 

n nnnn n nnnn n mn/i 
0.UUU9 U.UUU9 0.U1U4 


0.0009 0.0110 0.0176 
0.0012 0.0015 0.0114 
0.UUU9 0.UUU9 U.U126 


0.0009 0.0157 0.0194 
0.0010 0.0014 0.0114 

A AAAA n AA1 A A A1 

u.oooy U.UUlU U.Ulzo 


H=0.1:0.5 p = U.l 

{p = 2) 0.5 
U.9 


0.0008 0.0048 0.0143 
0.0006 0.0007 0.0116 

XXX 


0.0007 0.0054 0.0155 
0.0007 0.0010 0.0113 

XXX 


0.0006 0.0052 0.0139 
0.0007 0.0010 0.0154 

XXX 


H=0.5:0.9 p = U.l 

(p = 2) 0.5 
0.9 


0.0009 0.0125 0.0116 
0.0008 0.0009 0.0090 

XXX 


0.0009 0.0073 0.0125 
0.0009 0.0011 0.0099 

XXX 


0.0008 0.0035 0.0135 
0.0009 0.0011 0.0112 

XXX 


H=0.2 p = U.l 

(p = 5) 0.5 
U.y 


0.0005 0.0130 0.0255 
0.0006 0.0011 0.0234 

U.UUUO U.UUUO U.UzoZ 


0.0005 0.0135 0.0270 
0.0005 0.0011 0.0225 
U.UUUO U.UUUO u.uzou 


0.0006 0.0141 0.0243 
0.0006 0.0010 0.0226 

U.UUUO U.UUU / U.Uz4o 


H=0.8 p = 0.1 

(p = 5) 0.5 
0.9 


0.0010 0.03050 0.0265 
0.0009 0.0017 0.0202 
0.0009 0.0010 0.0199 


0.0008 0.0292 0.0239 
0.0011 0.0021 0.0185 
0.0011 0.0013 0.0214 


0.0009 0.0255 0.0249 
0.0011 0.0021 0.0198 
0.0010 0.0011 0.0192 


H=0.1:0.5 p^O.l 
{p = 5) 0.5 
0.9 


0.0006 0.0163 0.0272 
0.0007 0.0014 0.0220 

XXX 


0.0006 0.0172 0.0283 
0.0006 0.0012 0.0247 

XXX 


0.0006 0.0143 0.0251 
0.0006 0.0013 0.0229 

XXX 



Table 1: Empirical means of MSE of H estimates based on 100 replications of causal, well-balanced and 
general mfBm of length n ~ 1000. For the general case, the parameter 77^- is fixed to .2 x {\ — Hi — Hj) 
for i > j and ~ -~f]ij- The letters v,c,d respectively correspond to the Hurst exponents estimators 
computed with the weights w = (1,0,0) (using only the empirical variances) , w = (1,1,0) (using 
in addition the empirical covariances) and w = (1,1,1) (using in addition the difference of empirical 
covariances at lags ±m). 
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Parameters 


causal mfBm 


well-balanced mfBm 


general mfBm (77 = 0.2) 


P P 


P P 


P P 


H=0.2 p = 0.1 

{p = 2) 0.5 

u.y 


0.0013 0.0013 
0.0007 0.0007 

A AAA1 A AAA1 
O.OOOi U.OOOi 


0.0012 0.0078 
0.0007 0.1878 

A AAA1 A ^^AOA 
U.UUUl U.DU8U 


0.0014 0.0078 
0.0007 0.1878 

A AAA1 A CJAOA 

U.UUUi U.6O0O 


H=0.8 p = 0.1 

(p = 2) 0.5 
o.y 


0.0017 0.0017 
0.0011 0.0041 

A AAA1 A AAAT 

O.UUUi U.UOU( 


0.0022 0.0080 
0.0011 0.1883 

A AAA1 A t^r\0 A 
U.UUUl U.DU84 


0.0022 0.0081 

0.0013 0.1890 

A AAA1 A (^AOO 

U.UUUi U.60o3 


H=0.1:0.5 p = 0.1 

(p = 2) 0.5 

A n 

O.y 


0.0014 0.0013 
0.0005 0.0007 

X X 


0.0010 0.0078 
0.0005 0.1878 

X X 


0.0009 0.0077 
0.0007 0.1879 

X X 


H=0.5:0.9 p = 0.1 

{p = 2) 0.5 

A n 

O.y 


0.0012 0.0027 
0.0037 0.0411 

X X 


0.0010 0.0082 
0.0023 0.2050 

X X 


0.0011 0.0083 
0.0028 0.2029 

X X 


H=0.2 p = 0.1 

(p = 5) 0.5 
U.y 


0.0011 0.0011 

0.0007 0.0008 

n nnni n nnnQ 
U.UUUl U.UUUo 


0.0012 0.0065 
0.0006 0.1503 
U.UUUi U.4oDy 


0.0013 0.0065 
0.0008 0.1504 
U.UUUi U.4o04 


H=0.8 p = Q.l 

{p = 5) 0.5 

O.y 


O.OOiy 0.0039 
0.0010 0.0686 
0.0001 0.2060 


0.0019 0.0075 
0.0010 0.1753 
0.0001 0.5683 


0.0018 0.0076 
0.0012 0.1765 
0.0001 0.5693 


H=0.1:0.5 p = 0.1 

{p = 5) 0.5 

O.y 


0.0013 0.0013 
0.0006 0.0010 

X X 


0.0013 0.0065 
0.0007 0.1514 

X X 


0.0013 0.0065 
0.0007 0.1506 

X X 



Table 2: Empirical means of MSE of the estimates of pij based on 100 replications of causal, well-balanced 
and general mfBm of length n = 1000. For the general case, the parameter rjij is fixed to .2 x [1 — Hi — Hj) 
for i > j and r/ji = The letters n, d, v correspond to different strategies detailed in Section [5] 
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Parameters 




causal mfBm 


well-balanced mfBm 


general mlBm [rj = 0.2) 










^" 






rf 






H 


=0.2 


P 


= 0.1 


0.2081 


0.2663 


0.1834 


0.3598 


0.1169 


3.8969 




= 2) 




0.5 


0.1513 


0.4219 


0.1315 


163.75 


0.0862 


0.3576 








o.y 


0.0401 


1.3426 


0.0366 


0.0521 


0.0112 


0.7055 


H 


=0.8 


P 


= 0.1 


o.ieys 


15.6143 


0.1240 


0.1985 


0.0908 


0.4624 


(P 


= 2) 




0.5 


0.1181 


0.1155 


0.1058 


0.1124 


0.0545 


0.2181 








O.y 


0.0274 


0.0312 


0.0353 


0.0609 


0.0065 


0.0797 
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=0.1:0.5 


P 


= 0.1 


0.2416 


26.3081 


0.3332 


4.6943 


0.2333 


8.3665 




= 2) 




0.5 


0.0507 


0.4746 


0.2185 


684.53 


0.1839 


2.0666 








A n 

O.y 


X 


X 


X 


X 


X 


X 


H 


=0.5:0.9 


P 


= 0.1 


0.2472 


1.6664 


0.2067 


1.6916 


0.0999 


1.6900 


{v 
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0.5 


0.9107 


15.6106 


0.0927 


0.9720 


0.0519 


0.6583 








A n 

O.y 


X 


X 


X 


X 


X 


X 


H 


=0.2 


P 


= 0.1 


0.1919 


0.8977 


0.1954 


16.3425 


0.1095 


0.4842 


\P 
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0.5 


A 1 /I .1 A 

0.1440 


A n /I o /I 
0.9434 


A 1 C A1 

0.1501 


A O C 1 

0.6351 


A A'TAA 
U.O ^90 


61.6(16 








u.y 


0.0362 


1.3560 


0.0394 


0.2708 


0.0096 


0.4003 


H 


=0.8 


P 


= 0.1 


0.1418 


3.8831 


0.1503 


0.3738 


0.0789 
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= 5) 
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0.1126 


0.3640 


0.1154 


0.2667 


0.0522 


0.2319 
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0.1154 
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0.0074 


0.0494 


H 
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P 
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2.8168 


900.01 


1.6256 


96.4169 


2.5045 


712.66 


(P 
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30.279 


259.83 




2.7290 X 


14.264 


1455.58 








o.y 


X 


X 


X 


X 


X 


X 



Table 3: Empirical means of MSE of the estimates of rjij based on 100 replications of causal, well-balanced 
and general mfBm of length n = 1000. For the general case, the parameter rjij is fixed to .2 x {\ — Hi — Hj) 
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